The development of Tagged Uyghur Corpus
نویسندگان
چکیده
The history and development of Uyghur language is introduced. After a brief introduction to the development of Uyghur words, morphology and syntax, we explain our developing of a computer-aided contemporary Uyghur language tagging system. The coverage of this corpus, the resources building, the rules for syncopating and tagging etyma and termination, and the tagging of a corpus using a small tagset are explained. Some practical methods solving problems in Uyghur language tagging are also proposed. Key word: history and developmnet of Uyghur language, Uyghur tagged corpus, Uyghur language tagging system.
منابع مشابه
Corpus based coreference resolution for Farsi text
"Coreference resolution" or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreference when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...
متن کاملThe Research and Development of Computer Aided Contemporary Uighur Language Tagging System
The research on the contemporary Uyghur information processing in the 20th century can be dated back to the beginning of 1980‘s. There are a lot of achievements up to now ,for example, fundamental theory and basic utility are established, various corpus are built, code standard is designed, Uyghur grammatical attribution research is carried out。Because the Uyghur language has its own nature,the...
متن کاملUyghur-Chinese Translation Disambiguation Method Research Based on Knowledge Automatic-Acquisition
This thesis studies the disambiguation method in Uyghur-Chinese translation, and proposes the design philosophy of automatic-acquisition in translation label library aiming at the deficiency of disambiguation corpus in Uyghur. It refers to the existing Uyghur-Chinese bilingual dictionary, Chinese corpus and the Internet, and acquires the corresponding Chinese translation label examples to Uyghu...
متن کاملPAYMA: A Tagged Corpus of Persian Named Entities
The goal in the named entity recognition task is to classify proper nouns of a piece of text into classes such as person, location, and organization. Named entity recognition is an important preprocessing step in many natural language processing tasks such as question-answering and summarization. Although many research studies have been conducted in this area in English and the state-of-the-art...
متن کاملپیکره اعلام: یک پیکره استاندارد واحدهای اسمی برای زبان فارسی
Named entity recognition (NER) is a natural language processing (NLP) problem that is mainly used for text summarization, data mining, data retrieval, question and answering, machine translation, and document classification systems. A NER system is tasked with determining the border of each named entity, recognizing its type and classifying it into predefined categories. The categories of named...
متن کامل